Subtractive genomics profiling for potential drug targets identification against Moraxella catarrhalis

Moraxella catarrhalis (M. catarrhalis) is a gram-negative bacterium responsible for major respiratory tract and middle ear infections in infants and adults. The recent emergence of antibiotic-resistant M. catarrhalis makes the prioritization of an effective drug target urgent. The failure of new drugs and the host toxicity associated with traditional drug development approaches can be reduced by using an in silico subtractive genomics approach. In the current study, an in silico genome subtraction approach was applied to identify potential pathogen-specific drug targets against M. catarrhalis. Starting from the whole proteome of the pathogen, we applied a series of subtraction steps that removed paralogous proteins and proteins with extensive homology to human proteins, and retained proteins that were essential, drug-like, virulent, and associated with antibiotic resistance. In total, 38 potential drug targets were identified in this study. Eventually, one protein, histidine kinase (UniProt ID: D5VAF6), was identified as a potential new drug target and forwarded to structure-based studies. Furthermore, virtual screening of 2000 compounds from the ZINC database was performed against the histidine kinase, which resulted in the shortlisting of three compounds as potential therapeutic candidates based on their binding energies and their ADMET properties. The identified protein provides a platform for the discovery of a lead drug candidate that may inhibit it and may help to eradicate the otitis media caused by drug-resistant M. catarrhalis. The current study also established a pipeline for drug target identification that may assist wet-lab research in the future.


Introduction
Moraxella (Branhamella) catarrhalis, previously known as Neisseria catarrhalis or Micrococcus catarrhalis, is a gram-negative, aerobic diplococcus that is predominantly reported as a commensal of the upper respiratory tract. M. catarrhalis has emerged as a notorious bacterial pathogen in the past 20 to 30 years [1]. It causes acute otitis media in infants and exacerbation of chronic bronchitis in adults. It is typically found in infections alongside other serious pathogens such as Streptococcus pneumoniae or Haemophilus influenzae, and is encountered in up to 50% of cultures [2]. M. catarrhalis is a prime cause of various active infections in hosts with weakened immune systems, including pneumonia, endocarditis, septicemia, and meningitis [1]. Furthermore, hospital outbreaks of M. catarrhalis-related respiratory disease have been characterized, and the organism has been classified as a nosocomial pathogen. For decades, M. catarrhalis was thought to be a harmless commensal, and little is known about its pathogenic features and virulence factors even though research in this field has expanded in recent years [3].
Classical antibiotic treatment alleviates the clinical burden. However, unrestricted antibiotic use is a major factor in the rapid spread of antibiotic-resistant bacteria, which has reduced the number of viable antimicrobial options [4]. Antimicrobial resistance has risen dramatically in the recent past. Acute exacerbations of chronic obstructive pulmonary disease (COPD) and other respiratory diseases caused by M. catarrhalis are notoriously difficult to treat because of rising minimum inhibitory concentrations (MICs) and antimicrobial drug resistance [5].
Genome and proteome analysis aids in the identification of potential drug targets for the treatment of highly pathogenic diseases. Comparative and subtractive genomic analysis is a demanding task because it involves high-dimensional data. The arrival of the post-genomic era and the availability of pathogen whole-genome sequences opened multiple avenues for methodologies such as comparative subtractive genomics to design new drugs and vaccine targets. The cost-effectiveness of these methods has opened new pathways for finding potential drug candidates, accelerated the process of drug discovery, expanded the number of treatment options, and reduced the failure rate in the later stages of clinical trials [6]. Computational approaches have enabled the identification of potent therapeutic targets against such pathogens [7]. The method has already been used to successfully prioritize and predict therapeutic targets for Clostridium botulinum [8], Mycoplasma pneumoniae [9], Rickettsia [10], Neisseria gonorrhoeae [11], Salmonella typhi [6], and Shigella dysenteriae [12].
In the present study, the genomic data of Moraxella catarrhalis strain BBH18 were investigated to find unique therapeutic targets and therapeutic candidates. The study includes a comparative and subtractive genomics analysis, Protein-Protein Interaction (PPI) network analysis, and assessment of the essentiality and druggability of target proteins and of the ADMET properties of candidate compounds. Certain limitations of previous studies against M. catarrhalis, such as the lack of consideration of hub nodes and conserved drug targets, are addressed in this study. Future studies may involve the development of antibacterial lead compounds against these shortlisted potential drug targets.

Material and methods
A subtractive genomics approach was employed for drug target prioritization against M. catarrhalis, a pathogen of clinical and biological importance. The BBH18 strain was chosen to identify the potent drug target and candidate because little in silico drug target identification work had been reported for this strain and because it was the only reference strain available for M. catarrhalis. Several databases and tools, as illustrated in the flow chart in Fig 1, were used for the determination of therapeutic targets.

Retrieval of proteomes of pathogen and host
The whole proteomes of Moraxella catarrhalis BBH18 and the human host were both obtained from the Universal Protein Resource (UniProt) database [13]. Additionally, the Database of Essential Genes (DEG) [14,15] was used to screen the essentiality of drug targets, and the DrugBank database version 5.1.8 [16] was used to investigate the druggability of the proposed targets. Moreover, the Virulence Factor Database (VFDB) [17] was used to curate information about the virulence factors of M. catarrhalis, whereas ARG-ANNOT (Antibiotic Resistance Gene-ANNOTation) AA V6 July 2019 was used for the detection of existing and putative new Antibiotic Resistance (AR) genes in pathogen genomes [18].

Subtractive genomics approach
Subtractive genomics is a widely employed approach that subtracts the sequences shared between the host and pathogen proteomes and metabolic pathways, leaving a set of proteins that are required by the microorganism but do not exist in the corresponding host. It helps to identify unique drug targets that are essential for the survival of the pathogen and that can be inhibited without altering the metabolic pathways of the host [19].

Removal of paralogous protein sequences
The complete proteome of Moraxella catarrhalis BBH18 was clustered at a 60% sequence identity threshold using Cluster Database at High Identity with Tolerance, i.e. CD-HIT [20,21]. Proteins with a sequence identity greater than 60% were considered paralogous to each other. The paralogous sequences, i.e. duplicates, were removed, and only the non-paralogous sequences were kept for the downstream analysis.
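The paralog removal step can be illustrated with a minimal sketch that reads the cluster report CD-HIT writes alongside its output (the `.clstr` file) and lists the redundant members of each cluster. The sequence identifiers and the sample report below are hypothetical, and the function names `parse_clstr` and `redundant_ids` are introduced here for illustration only; the study does not state how its cluster file was post-processed.

```python
import re

# Matches the sequence identifier in a member line such as
# "1\t348aa, >protB... at 92.50%"
MEMBER_RE = re.compile(r">(\S+?)\.\.\.")

def parse_clstr(text):
    """Parse CD-HIT .clstr text into a list of clusters.

    Each cluster is a list of (sequence_id, is_representative) tuples;
    CD-HIT marks the representative sequence with a trailing '*'.
    """
    clusters = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">Cluster"):
            clusters.append([])
        elif line and clusters:
            match = MEMBER_RE.search(line)
            if match:
                clusters[-1].append((match.group(1), line.endswith("*")))
    return clusters

def redundant_ids(clusters):
    """Return the non-representative members, i.e. the sequences to discard."""
    return [sid for cluster in clusters for sid, is_rep in cluster if not is_rep]

# Hypothetical report: protB clusters with protA, protC is a singleton.
SAMPLE = """>Cluster 0
0\t350aa, >protA... *
1\t348aa, >protB... at 92.50%
>Cluster 1
0\t120aa, >protC... *
"""

clusters = parse_clstr(SAMPLE)
print(len(clusters), "clusters; redundant:", redundant_ids(clusters))
```

Keeping only the representative of each cluster mirrors the idea of retaining one copy per paralogous group for the downstream BLASTp steps.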

Identification of non-homologous protein
Next, the set of proteins retained after the removal of paralogs was subjected to BLASTp [22] against the Homo sapiens proteome with an expectation value (E-value) cut-off of 10⁻⁵ [21]. BLASTp divided the queries into 'Hits Found' (sequences homologous between the pathogen and the host) and 'No-Hits Found' (non-homologous sequences). The non-homologous sequences, which showed no resemblance to human proteins, were selected for further analysis.
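As a rough sketch of this filtering step, the snippet below separates query proteins into hits and no-hits from BLAST+ tabular output. It assumes the search was exported with `-outfmt 6`, whose default columns place the query identifier first and the E-value in the eleventh column; the study does not specify the output format used, and the protein names and alignment rows are made up for illustration.

```python
def split_by_hits(blast_tab, query_ids, evalue_cutoff=1e-5):
    """Split query IDs into (hits, no_hits) using BLAST tabular output.

    A query counts as a hit if at least one of its alignments has an
    E-value at or below the cut-off.
    """
    hit_queries = set()
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, evalue = fields[0], float(fields[10])
        if evalue <= evalue_cutoff:
            hit_queries.add(qseqid)
    hits = [q for q in query_ids if q in hit_queries]
    no_hits = [q for q in query_ids if q not in hit_queries]
    return hits, no_hits

# Hypothetical rows in the default 12-column layout:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
ROWS = [
    ("protA", "humanX", "45.2", "300", "150", "4", "1", "300", "5", "310", "3e-40", "160"),
    ("protB", "humanY", "28.0", "60", "40", "2", "10", "70", "3", "62", "0.8", "25.1"),
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

hits, no_hits = split_by_hits(SAMPLE, ["protA", "protB", "protC"])
print("homologous to host:", hits)
print("non-homologous:", no_hits)
```

The same function could serve the later steps with the roles reversed: against the host proteome the no-hit list is kept, whereas against DEG, DrugBank, VFDB, and ARG-ANNOT the hit list is kept.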

Identification of essential non-homologous genes
Essential proteins of an organism are those that play a significant role in cellular metabolism [23]. Hence, the non-homologous M. catarrhalis proteins were searched against DEG using BLASTp. Essential proteins were selected stringently, with an E-value threshold of 10⁻¹⁰⁰ and a minimum cut-off score of 100 [21]. This yielded a set of proteins that were both non-homologous to the host and essential to M. catarrhalis.

Druggability of essential proteins
Next, the nonhomologous essential proteins were evaluated using BLASTp against the Food and Drug Administration (FDA) approved therapeutic target proteins obtained from DrugBank. The evaluation was performed with an E-value cut-off of 10⁻⁵ to assess the drug-target-like character of the identified essential proteins and to prioritize novel and unique therapeutic targets [21].

Identification of essential virulent proteins
Furthermore, the selected genes were subjected to BLASTp against the Virulence Factor Database with an E-value cut-off of 10⁻⁵ to determine the proteins possessing the highest virulence factor [24].

Resistance proteins analysis
For the structure-based studies, only proteins associated with antibiotic resistance were required. BLASTp analysis was performed for the essential virulent proteins against ARG-ANNOT with an E-value cut-off of 10⁻⁵ [24]. The resulting data set consisted of non-homologous essential proteins associated with antibiotic resistance and was taken forward to the structure-based studies.

Identification of subcellular localizations
Every protein has a distinct function at a specific location. These locations are crucial because proteins are distributed to specific regions of the cell once they are released, and failure of a protein to reach its proper location may cause a variety of disorders. Therefore, PSORTb version 3.0.2 [25] and CELLO2GO [26] were used to determine the subcellular localization of all essential, drug-like, and nonhomologous proteins. The underlying principle of Subcellular Localization (SCL) prediction is to search the nonhomologous proteins with BLAST against proteins of known subcellular location. These tools classified proteins into distinct types based on their cellular location: cytoplasm, cytoplasmic membrane, inner membrane, outer membrane and periplasmic membrane, extracellular region, and undetermined [6,24].

Structure prediction and homology modelling
The proteins shortlisted by the subtractive genomics approach were searched for available structures in the Protein Data Bank (PDB). BLASTp was used to find a suitable template for protein structure modelling. Where no 3-dimensional structure was available, the protein structure was modelled using the SWISS-MODEL homology modelling server [27]. In the absence of an experimentally determined crystal structure, homology modelling is the most precise and efficient method for constructing 3D protein structures. It works by comparing the target sequence with the sequences of proteins in the Protein Data Bank [24].

Validation of protein structure
Before the docking experiments, the modelled structures were validated using several tools, each based on a different principle. PROCHECK [28] was used to evaluate the stereochemical quality of the protein structure by analyzing its residue-by-residue geometry as well as its overall structural geometry. ERRAT, an empirical atom-based analysis tool [29], was also applied, and PSIPRED was used for sequence-based validation by predicting the secondary structure (β-sheets, α-helices, and random coils) of the shortlisted proteins [30]. In addition, the ProSA web server [31] was employed to validate the modelled protein structure against the structures available in the PDB on the basis of the Z-score [32].

Ligand and active site prediction
Once the structure was modelled, the active site to which a ligand could bind had to be identified. Because the modelled structure contained no ligand in its active site, the ligand was predicted from the template protein obtained from the BLAST search. This active site was then chosen for docking ligands against the respective protein.

Molecular docking studies
Molecular docking is a computational technique used to predict the non-covalent binding of a macromolecule, i.e. a protein (receptor), and a ligand, starting from their unbound structures, here acquired from homology modelling [33]. The most effective ligand in molecular docking has the lowest docking score for its target protein. Docking was performed in AutoDock v4.2 [34] using standard parameters, with 10 Lamarckian genetic algorithm (GA) runs and a maximum of 27,000 generations. In the docking experiment, the modelled protein functioned as the target and the identified compound acted as the ligand [35].

Redocking and virtual screening for the identification of novel drug candidates
The docking parameters were first validated by re-docking the ADP co-crystal ligand found within the binding site of histidine kinase. The conventional docking protocol was used with AutoDock. The ligand was docked with 250 Lamarckian GA runs, a maximum of 27,000 generations, and 2,500,000 evaluations [36]. The re-docking was performed to assess whether the docking program could reproduce the crystal conformation of the bound ligand [36].
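Re-docking is usually judged by the root-mean-square deviation (RMSD) between the docked pose and the crystal pose of the same ligand. The sketch below computes that quantity for two matched lists of atom coordinates without any superposition, since both poses already sit in the receptor frame. The coordinates are toy values, and the 2.0 Å acceptance threshold is a commonly used convention assumed here rather than a value stated in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each list holds (x, y, z) tuples in angstroms; atom i in coords_a must
    correspond to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example: every atom of the docked pose is shifted by 1 angstrom along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

ACCEPT_BELOW = 2.0  # assumed convention for a successful re-docking, in angstroms
value = rmsd(crystal_pose, docked_pose)
print(f"RMSD = {value:.3f} A, accepted: {value < ACCEPT_BELOW}")
```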
Next, virtual screening of 2000 compounds from the ZINC database [37] was performed against the histidine kinase, i.e. the selected drug target protein, to identify novel drug candidates using AutoDock Vina [38]. These compounds were selected for their molecular weight range of 150 to 350 Da (Dalton), since according to Lipinski's rule of five the molecular weight should be below 500 Da, and for their easy availability from the in-house (institute's) library. The grid was set to 72 points on the X-axis, 112 on the Y-axis, and 104 on the Z-axis, with the grid center at 55.3, 30.378, and 26.716, respectively [38]. AutoDock Vina PDBQT Split 1.1.1 [39] was used to split the prepared PDBQT library into the required files. Virtual screening was carried out using the default parameters applied in the docking study.
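The library selection criteria can be sketched as a simple descriptor filter. The code below counts violations of Lipinski's rule of five (molecular weight no greater than 500 Da, logP no greater than 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors) and applies the 150 to 350 Da molecular weight window. The `Descriptors` class, the compound names, and all descriptor values are hypothetical; in practice the descriptors would come from a cheminformatics toolkit or from the ZINC records, which this study does not detail.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Pre-computed descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float   # Da
    logp: float
    h_donors: int
    h_acceptors: int

def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria the compound breaks."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])

def passes_library_filter(d, mw_min=150.0, mw_max=350.0, max_violations=1):
    """Keep compounds inside the molecular weight window with few violations."""
    in_window = mw_min <= d.mol_weight <= mw_max
    return in_window and lipinski_violations(d) <= max_violations

# Made-up compounds for illustration only.
library = [
    Descriptors("cmpd_small", 250.3, 2.1, 2, 4),
    Descriptors("cmpd_big", 612.7, 6.3, 6, 11),
]
for compound in library:
    print(compound.name, lipinski_violations(compound), passes_library_filter(compound))
```

Allowing at most one violation follows the usual reading of the rule; a stricter screen would set `max_violations=0`.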

Post docking analysis
The Molecular Operating Environment (MOE 2019.01) [40] was used to assess the docked ligand-protein interactions and to depict the ligand's H-bonds and hydrophobic interactions with the docked protein within a range of 5 Å. The MMFF94s force field was used for energy minimization.

Physiochemical property profiling and toxicity predictions
The physicochemical (i.e. ADME) properties of the ZINC compound library were examined to determine the characteristics and parameters that may influence bioactivity. The physicochemical analysis covers the estimation of drug-likeness (e.g., Lipinski's rule, lead resemblance), molecular weight, interaction with the biological environment (e.g., cell, skin, and intestinal permeability), biopharmaceutical properties (e.g., pKa value, solubility), interaction with plasma proteins, and drug bioavailability. The pkCSM [41] and SwissADME [42] tools were used to analyze the Absorption, Distribution, Metabolism, and Excretion (ADME) properties as well as a number of factors related to the pharmacological action of the drug [43].

Prediction of protein-protein interaction of identified drug target
Histidine kinase was the identified drug target protein. It was found to be essential through the Database of Essential Genes (DEG) and was predicted to be cytoplasmic, and it was then evaluated for interactions with other proteins. STRING version 11.5 (Search Tool for the Retrieval of Interacting Genes/Proteins) [44] is a database of protein interactions that includes both verified and predicted interactions. Interactions can be direct (physical) or indirect (functional). STRING statistically integrates interaction data from its sources for many species and transfers information between organisms where required. At the time of the study the database comprised 5,214,234 proteins from 1133 species [45]. It was used to determine whether the identified drug target can act as a hub protein and to validate its functional interactions [46]. Hub proteins are classified using node degrees and clustering coefficients. A medium confidence value of 0.40 (the default setting) was set as the minimum required interaction score for the PPIs.
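The two hub criteria named above, node degree and local clustering coefficient, can be computed directly from a scored edge list. The following minimal sketch builds an undirected graph from edges that meet the 0.40 confidence threshold and evaluates both quantities for one node. The node labels, edges, and scores are a toy network invented for illustration, not STRING output.

```python
from itertools import combinations

def build_graph(scored_edges, min_score=0.40):
    """Build an undirected adjacency map from (node_a, node_b, score) tuples,
    keeping only edges whose score reaches the confidence threshold."""
    graph = {}
    for a, b, score in scored_edges:
        if score >= min_score and a != b:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def local_clustering(graph, node):
    """Fraction of the node's neighbour pairs that are themselves connected."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))

# Toy network: T is the candidate target; the B-C edge falls below 0.40.
EDGES = [
    ("T", "A", 0.90),
    ("T", "B", 0.80),
    ("T", "C", 0.70),
    ("A", "B", 0.60),
    ("B", "C", 0.30),
]

graph = build_graph(EDGES)
print("degree of T:", len(graph["T"]))
print("clustering of T:", round(local_clustering(graph, "T"), 3))
```

A node with a high degree relative to the rest of the network is a hub candidate; the clustering coefficient then indicates how tightly its neighbours are connected to one another.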

Results

Subtractive genomics approach
The current study is an application of an efficient subtractive genomics approach, as shown in Fig 1. Fig 1 depicts the complete series of steps as well as the tools and databases used for the identification of potent drug targets against Moraxella catarrhalis BBH18. The in silico evaluation showed that the complete proteome of Moraxella catarrhalis BBH18 comprised a total of 1881 proteins. The step-wise filtering of the proteins during the current study is shown in Table 1.

Removal of paralogous protein sequences
CD-HIT identified a total of two paralogous proteins among the 1881 proteins. The remaining 1879 proteins were non-paralogous.

Non-homologous proteins identification
These proteins were then subjected to BLASTp analysis against the human proteome to select the proteins that are non-homologous to human proteins. A total of 519 proteins showed similarity with human proteins and were excluded from the downstream analysis, as targeting them may cause cross-reactivity and undesired toxicity in humans. Therefore, a total of 1360 non-homologous proteins were selected for further analysis.

Identification of essential non-homologous genes
Moreover, a BLASTp search was performed against DEG, which comprises a collection of essential genes found in a wide variety of pathogenic and non-pathogenic organisms (both prokaryotes and eukaryotes). A total of 91 proteins were identified as essential for the viability of M. catarrhalis and could be proposed as potent drug targets.

Drug ability of essential protein
Additionally, the above 91 proteins were subjected to BLASTp analysis against the DrugBank database. Only proteins with considerable sequence similarity to FDA-approved therapeutic targets were chosen, and the rest were omitted from the dataset. The BLASTp alignment search resulted in 38 druggable proteins of M. catarrhalis.

Identification of essential virulent and antibiotic resistance protein
Next, these 38 proteins were evaluated using BLASTp analysis against the Virulence Factor Database. Only 14 of them were classified as essential virulent proteins, i.e. proteins with a high virulence factor in M. catarrhalis. Of these 14 shortlisted proteins, only four were identified as antibiotic resistance proteins against the ARG-ANNOT database.
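The counts reported across these filtering steps can be cross-checked with a few lines of arithmetic. The sketch below lists the number of proteins retained after each step, as stated in the text, and derives how many were discarded at each stage; the step labels and the helper name `removed_per_step` are introduced here for illustration.

```python
# Proteins retained after each filtering step, as reported in the text.
STEPS = [
    ("complete proteome", 1881),
    ("non-paralogous (CD-HIT)", 1879),
    ("non-homologous to human (BLASTp)", 1360),
    ("essential (DEG)", 91),
    ("druggable (DrugBank)", 38),
    ("virulent (VFDB)", 14),
    ("antibiotic resistance (ARG-ANNOT)", 4),
]

def removed_per_step(steps):
    """Return (step_name, proteins_removed) for every step after the first."""
    return [
        (later[0], earlier[1] - later[1])
        for earlier, later in zip(steps, steps[1:])
    ]

for name, removed in removed_per_step(STEPS):
    print(f"{name}: {removed} proteins removed")
```

The first two differences reproduce the two paralogs and the 519 human-homologous proteins mentioned above, which confirms that the reported counts are internally consistent.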

Subcellular localization prediction
In the subtractive genomics approach, PSORTb was employed to find the subcellular location of the nonhomologous essential proteins. Among the 38 essential drug-like proteins, PSORTb predicted 79% to be cytoplasmic, 18% to be cytoplasmic membrane proteins, and only 3% to be outer membrane proteins. According to the CELLO2GO results, 65.3% of proteins appeared to be cytoplasmic, 24.5% were inner membrane proteins, 4.1% were periplasmic proteins, 4.1% were outer membrane proteins, and 2% were extracellular proteins, as shown in Table 2. The distribution of all essential proteins in M. catarrhalis is depicted in Fig 2A and 2B.

Novel drug targets prediction
In this study, 38 potential drug targets were shortlisted, as shown in Table 3. Because they are nonhomologous and non-paralogous, these 38 proteins may be considered promising therapeutic targets. Among the 38 potential drug targets, four were classified as antibiotic resistance proteins. Of these, one protein, sensor histidine kinase (D5VAF6), was shortlisted as an essential, non-homologous, druggable target with a high virulence factor and antibiotic resistance against M. catarrhalis, and was therefore taken forward to structure-based studies. Fig 3 shows the comprehensive outcome of the current study.

Significance of selected protein
Sensor histidine kinase is an ATP-binding signal transduction protein found in the two-component system of M. catarrhalis [47]. Sensor histidine kinases detect changes in the environment surrounding the pathogen (such as stress or the presence of a drug) and transmit signals into the cell that dynamically adjust its internal mechanisms, preparing the bacterium to respond to these changes. Changes in these sensor kinases have been linked to resistance to many antibacterial drugs such as cefotaxime [48].

Homology modeling of shortlisted drug target
The 3-dimensional structure of histidine kinase (one of the four shortlisted proteins) was not available in the Protein Data Bank (PDB). As a result, the protein's FASTA sequence with the accession number D5VAF6 was used for homology modelling. The PDB structures 4CTI, 4BIU, and 4BIZ were possible templates, with percent identities of 34.67%, 27%, and 26.69%, respectively. Ultimately, the structure 4BIZ, with 26.69% sequence identity and 54% query coverage, was selected as the template because of its similarity and the availability of a bound ligand, and the structure was modelled as shown in Fig 4.

Modelled structure's validation
Various tools were used to verify the modelled protein structure, i.e.

I) Confirmation of Proteins through PSIPRED
PSIPRED was used for the secondary structure validation of the protein. It validated the structure based on the predicted formation of helices and beta sheets, as shown in S1 Fig.

II) PROCHECK Validation of Proteins
PROCHECK was used to generate a Ramachandran plot for the modelled protein structure. The Ramachandran plot showed about 91.0% of residues in the favorable region, 16 residues (about 6.8%) in the additionally allowed regions, 4 residues (about 1.7%) in the generously allowed regions, and one residue in the disallowed region, as shown in S2 Fig.

III) ERRAT Validation of Proteins
The ERRAT tool was used to evaluate the statistics of non-bonded interactions between atoms in the structure. It gave a quality factor of about 89.147%, as shown in S3 Fig.

IV) PROSA web Validation of Proteins
The ProSa web server tool was used to calculate the quality of the 3-D structures of proteins in terms of Z-score in the structure of modelled protein. The resulted Z-score is -7.33. The Z-score was calculated using NMR Spectroscopy (dark blue) and X-ray crystallography

Protein-ligand interactions study through docking
The protein-ligand interactions were analyzed through AutoDock Vina.

ii) Molecular Docking with AutoDock
AutoDock 4.2 was used for the docking study of histidine kinase. The ADP ligand was docked with 10 Lamarckian GA runs, a maximum of 2,500,000 energy evaluations, and a maximum of 27,000 generations. AutoDock returned the ligand bound in the active site of the protein in various orientations and conformations. Each conformation has a distinct binding energy, ranging from negative to positive. The conformation with the lowest binding energy, -6.3 kcal/mol, was ranked as the top conformation of ADP, since a lower binding energy indicates more spontaneous binding of the ligand to the active site and a more stable, lower-energy complex.

Redocking and virtual screening for identification of novel drug candidates
The redocking validation of co-crystallized ligand yielded a binding energy prediction of -5.32 kcal/mol. Fig 5B depicts

PLOS ONE
structure was 1.660 Å indicating that the docking parameters could be implemented for virtual screening.
The modelled structure was docked with the ZINC library. The compounds were screened using the same parameters that had been used for the AutoDock validation (redocking). As seen in Fig 6A, the screening identified 789 highly ranked compounds with binding energies ranging from -5.5 to -6.5 kcal/mol, whereas 1424 molecules exhibited favorable interactions with histidine kinase, with binding energies between -5.5 and -9.3 kcal/mol, as illustrated in Fig 6B. These compounds may serve as leads in the future. Furthermore, Fig 6C shows the three potent drug candidates identified for histidine kinase from the 2000 ZINC compounds. These candidates are ZINC09185674, ZINC03839141, and ZINC00631248, with binding energies of -6.4, -6.2, and -6.2 kcal/mol, respectively.

Post docking analysis (i.e. interaction analysis of selected compounds with histidine kinase)
The post docking interaction analysis of the shortlisted compounds was conducted to understand their mechanism of binding and pharmacological activity against histidine kinase. The rank order by docking score was ZINC09185674 > ZINC03839141 and ZINC00631248, with binding energies of -6.4, -6.2, and -6.2 kcal/mol, respectively.
The docking analysis for ZINC09185674 revealed a considerable binding energy of -6.4 kcal/mol. ZINC09185674 formed hydrophobic interactions within the binding cavity of histidine kinase. It also mediates H-bonds as a hydrogen acceptor to Arg356 and Lys355 via the hydroxyl group, and one H-bond as a hydrogen acceptor to Gln331 via the pyrrole ring's oxygen.
The binding score of ZINC03839141 was -6.2 kcal/mol. It formed hydrophobic interactions within the binding cavity. As a hydrogen acceptor, it forms two hydrogen bonds with Tyr339. ZINC03839141 also interacted ionically with Asp332.
With a docking score of -6.2 kcal/mol, ZINC00631248 demonstrated strong hydrophobic interactions. Importantly, the reference compound (ADP) showed an ionic interaction with Arg428. It serves as a hydrogen donor with Tyr368 and a hydrogen acceptor with Glu407 in two H-bonds through its hydroxyl group, and it also shows a pi-pi interaction. The redocked compound in complex with the modelled protein (2D and 3D interactions) is depicted in Fig 7. Table 4 shows the docking scores and the bond types predicted by the MOE tool for the identified compounds.

Physiochemical property profiling and toxicity predictions
The pharmacokinetic parameters of the three selected compounds were calculated using the online pkCSM tool, covering Blood-Brain Barrier crossing capability, drug-likeness, toxicological analyses, and ADME characteristics. The Lipinski rule of five was employed in the drug-likeness characterization.
The SwissADME tool was employed to predict the drug-likeness of the compounds. Two of the three selected candidates showed zero violations of Lipinski's Rule of Five, whereas one compound showed only one violation and still had acceptable drug-like properties. The results of the ADME analysis, including Water Solubility, Blood-Brain Barrier (BBB) Permeability, Human Intestinal Absorption (HIA), Skin Permeability, CaCo2 Permeability, and Lipinski Violations of the three shortlisted compounds, are shown in Table 5.
The results of the toxicity analysis, i.e. Max Tolerated Dose (Human), Minnow Toxicity, Skin Sensitization, Hepatotoxicity, Ames Test, Oral Rat Acute Toxicity (LD50), and T. pyriformis Toxicity, are shown in Table 6. The table also includes the radar plot of each compound.

Prediction of protein-protein interaction of identified drug target protein
The predicted interactions can be used to filter and analyze functional genomic data and to annotate structural, functional, and evolutionary information on proteins.
Histidine kinase (UniProt ID: D5VAF6) was submitted to the STRING database to find its interactions with other proteins in Moraxella catarrhalis. MCR_0156 represents the histidine kinase, and its interactions with neighboring proteins were: MCR_0386 (Two-component system sensor histidine kinase) with a score of 0.721, MCR_0387 (Two-component system sensor histidine kinase) with a score of 0.811, MCR_0405 (Tetratricopeptide repeat family protein) with a score of 0.746, MCR_1062 (LuxR family transcriptional regulator) with a score of 0.998, bioF (8-amino-7-oxononanoate synthase) with a score of 0.844, csrA (Translational regulator CsrA) with a score of 0.770, ompR (Two-component system response regulator) with a score of 0.946, phoB (Two-component system phosphate regulon response regulator PhoB) with a score of 0.896, phoR (Two-component system phosphate regulon sensor histidine kinase PhoR) with a score of 0.894, and rumA (23S rRNA (uracil(1939)-C(5))-methyltransferase RlmD) with a score of 0.760. The network of the histidine kinase (MCR_0156) protein contained 78 nodes and 482 edges, compared with 309 expected edges, and had an average node degree of 12.4. The PPI enrichment p-value was < 1.0e-16, with an average local clustering coefficient of 0.657 (Fig 8). These proteins are engaged in a variety of critical functions. Targeting the histidine kinase protein may therefore compromise the function of the other interacting proteins, which supports its use as a therapeutic target.
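The reported network statistics are related by simple arithmetic: in an undirected network every edge contributes to the degree of two nodes, so the average node degree is twice the edge count divided by the node count. The short sketch below checks the reported average degree from the node and edge counts given above and also computes the ratio of observed to expected edges, which is an illustrative quantity not reported in the study.

```python
def average_degree(n_nodes, n_edges):
    """Average node degree of an undirected network (each edge counts twice)."""
    return 2 * n_edges / n_nodes

N_NODES = 78          # nodes reported for the network
N_EDGES = 482         # observed edges
EXPECTED_EDGES = 309  # edges expected for a random set of the same size

avg = average_degree(N_NODES, N_EDGES)
ratio = N_EDGES / EXPECTED_EDGES

print(f"average node degree: {avg:.1f}")
print(f"observed / expected edges: {ratio:.2f}")
```

An observed edge count well above the expected one is what the low PPI enrichment p-value reflects: the proteins in the network interact with each other more than a random set of similar size would.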

Discussion
Computational methods and approaches have gained considerable attention for the identification and development of potent drug targets [49]. Yet high-throughput experimental sequencing data are currently unavailable for the majority of infectious bacteria, and efforts to define and identify essential drug targets have relied solely on bioinformatics predictions [50]. Because of the clear rise in drug resistance among pathogens, in silico subtractive genomic analysis has been widely used for strain-specific drug target identification [51].
In the current research, a subtractive genomics approach was employed for the identification and prediction of potent drug candidates. The focus of the current research was one of the These proteins may result in the removal and destruction of the pathogen from the host through effective drug candidates and vaccines. Finally, one enzymatic protein, sensor histidine kinase (D5VAF6), which is involved in the two-component system, was selected as a potential drug target against M. catarrhalis. It plays a key role in the bacterium's development and survival [52]. Histidine kinase has been reported in different studies as a potential drug target against various pathogens such as Mycobacterium tuberculosis [53], Salmonella enterica [54], Streptococcus species [55], and Bacillus subtilis and Staphylococcus aureus [53]. However, this study is the first to report histidine kinase as a drug target for M. catarrhalis BBH18. Histidine kinase plays a major role in the two-component system of M. catarrhalis, which uses two-component signal transduction systems to translate extracellular and cellular signals into cell signaling. Because of the relevance of this protein's function, it might be used as a potential therapeutic target in the future [56].
Furthermore, the ZINC library (2000 compounds) was screened against the selected drug target to identify potential inhibitors. Following the screening process, 789 compounds with binding energies between -5.5 and -6.5 kcal/mol were identified as promising candidates, and 1424 molecules showed preferential interactions with histidine kinase (binding energies of -5.5 to -9.3 kcal/mol). ADMET profiling was performed on the docked compounds to predict highly potent drug-like molecules. Subsequently, only three compounds were shortlisted as novel drug candidates: ZINC09185674, ZINC03839141, and ZINC00631248. Ordered from lowest (most favorable) to highest, their binding energies are -6.4, -6.2, and -6.2 kcal/mol, respectively. To verify the docking analysis, molecular re-docking was performed for the reference compound (ADP) using the same parameters. The redocking gave an RMSD of 1.660 Å, validating the applied screening protocol.
Bioinformatics subtractive genomics analysis is used to identify prospective therapeutic targets and candidates in this research. Genome and proteome pipelines are examined to prioritize effective antimicrobial agents that may be useful in halting the progression of the severe disease 'Campylobacteriosis', followed by experimental verification. This might aid in the treatment of periodontal or other C. concisus-related disorders, as well as the reversal of C. concisus-induced intestinal microbial imbalance infections. The method employed has the potential to be used as a general method for target identification, and hence may be used in drug development.
Previous research identified different essential proteins that could be used as potential drug targets and candidates. In this work, cytoplasmic proteins are considered suitable as drug targets, whereas membrane-associated proteins can be employed for the formulation of peptide vaccines [57]. Other computational methodologies, together with this approach and experimental validation, can be used in the future to develop potential therapeutic strategies not only against M. catarrhalis but also against other pathogens.

Conclusion
The analysis of the genomes and proteomes of many pathogens has revolutionized the identification of therapeutic targets. In this research, a subtractive genomics approach was employed to determine non-homologous, essential, druggable proteins of M. catarrhalis. These potential drug targets may aid in developing novel antibiotics directed against M. catarrhalis, while ensuring that the targets have no counterpart in the host, i.e. Homo sapiens in this case, to avoid allergic responses or harmful consequences. Novel drug candidates that target the function of these proteins may be capable of eliminating the infection from the host. The findings cover the essential and potent drug targets in M. catarrhalis and could help future researchers to develop effective drugs and vaccines against M. catarrhalis strain BBH18.